Search Query Syntax


    KEY:

    1. Lower case words are meta-definitions (i.e. non-terminals defined 
       in terms of terminals and other non-terminals).

    2. UPPER case words are terminal symbols (keywords or reserved words).

    3. '|' implies "OR" if it begins a line. A character enclosed in single 
       quotes (' ') is a literal terminal symbol of that single character. 
       The '*' symbol means the previous object repeated 0 or more times. 
       All other punctuation marks are literal terminal symbols.

    4. A definition in comments /* ... */ is an English explanation or
       a regular expression definition.

    ---------------------------------------------------------------------

      query                 ::= set-query
                                | set-query SET range-list

      set-query             ::= term
                                | set-query AND term
                                | set-query BUTNOT term

      term                  ::= field-item
                                | & field-item
                                | term OR field-item
                                | & term OR field-item
                                
      field-item            ::= compare-condition
                                | between-condition
                                | proximate-condition

      compare-condition     ::= field-spec comp-op word
 
      comp-op               ::= = | != | < | <= | > | >=

      between-condition     ::= field-spec BETWEEN word , word
                                | field-spec OUTSIDE word , word
   
      proximate-condition   ::= phrase-term
                                | phrase-term PROXIMITY distance   

      distance              ::= constant group-unit

      group-unit            ::= WORD[S]
                                | SENTENCE[S]
                                | PARAGRAPH[S]
                                | DOCUMENT[S]

      phrase-term           ::= phrase-list
                                | phrase-term phrase-term-op phrase-list

      phrase-term-op        ::= +
                                | ~

      phrase-list           ::= field-phrase
                                | phrase-list , field-phrase

      field-phrase          ::= phrase
                                | field-spec: phrase
                                | phrase IN field-spec

      phrase                ::= phrase-item
                                | ( set-query )

      phrase-item           ::= approx-word
                                | phrase-item order-op approx-word

      order-op              ::= ' '
                                | -
    
      field-spec            ::= DICTIONARY constant field-list
                                | DRI constant field-list
                                | XPATH path-spec
                                | TAG path-spec
                                | field-list

      path-spec             ::= tag-spec
                                | "tag-spec"

      tag-spec              ::= //tag-list
                                | /tag-list
                                | tag-list
      
      tag-list              ::= tag
                                | @tag
                                | tag-list / tag

      field-list            ::= ALL
                                | FIELD[S] aalist-spec

      aalist-spec           ::= aalist-item
                                | aalist-spec , aalist-item

      aalist-item           ::= constant
                                | ~ constant
                                | constant[aaval-spec]
                                | ~ constant[aaval-spec]

      aaval-spec            ::= aaval-spec-item 
                                | aaval-spec , aaval-spec-item

      aaval-spec-item       ::= constant
                                | constant .. constant

      approx-word           ::= word 
                                | @ word

      word                  ::= real-word
                                | "exact-order-phrase"
                                | 'literal-phrase'

      real-word             ::= numeric
                                | id
                                | id'*'
                                | id'?'*
    
      range-list            ::= constant-list
                                | BETWEEN constant, constant
                                | OUTSIDE constant, constant

      constant-list         ::= constant
                                | constant-list , constant

      exact-order-phrase    ::= real-word
                                | real-word' 'exact-order-phrase

      literal-phrase        ::= /* Any character, including blanks and dashes */

      constant              ::= /* [0-9]+ */

      id                    ::= /* [A-Z][A-Z0-9_]* , or if in quotes,
                                   can be any non-blank character */

      numeric               ::= /* constant or floating pt. # (e.g. 1.23) */

      tag                   ::= /* XML generic ID */
   
    ------------------------------------------------------------------------
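
    The lexical classes at the end of the grammar (constant, numeric, id)
    can be sketched as regular expressions. This is only an illustration:
    the function name classify is hypothetical, the floating point form is
    assumed to be digits, a dot, digits (as in 1.23), and ids are matched
    without regard to case because the examples use lower case words.

```python
import re

# Transcribed from the grammar comment: constant ::= /* [0-9]+ */
CONSTANT = re.compile(r"[0-9]+")

# Assumption: a floating point number is digits, a dot, digits (e.g. 1.23).
FLOAT = re.compile(r"[0-9]+\.[0-9]+")

# The grammar gives [A-Z][A-Z0-9_]*. Ignoring case is an assumption,
# made because the examples contain lower case words such as "bob".
ID = re.compile(r"[A-Z][A-Z0-9_]*", re.IGNORECASE)


def classify(lexeme):
    """Return the most specific lexical class for an unquoted lexeme."""
    if CONSTANT.fullmatch(lexeme):
        return "constant"  # every constant is also a valid numeric
    if FLOAT.fullmatch(lexeme):
        return "numeric"
    if ID.fullmatch(lexeme):
        return "id"
    return "unknown"
```

    Under these assumptions, "47" classifies as a constant, "1.23" as a
    numeric, and "bob" as an id.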

    EXPLANATIONS:

    A.          The definition of a <real-word> can vary from database to
                database. Right-truncation wildcards are supported, along
                with embedded wildcards (i.e. one or more wildcard characters
                mixed with specific characters).
 
                The representation may vary according to word definition, 
                but will support the following level of functionality:

                Form 1: xxx*  <- matches ALL words starting with "xxx"

                Form 2: xxx?? <- matches ALL words starting with "xxx"
                                 and having a length <= 5. 

                Form 3: xx*yy <- matches ALL words starting with "xx"
                                 and ending in "yy".

                Form 4: x??y* <- matches ALL words starting with "x",
                                 followed by two arbitrary characters, then
                                 a "y", then ending with 0 or more extra
                                 characters.
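
    The four forms above can be sketched as a translation into regular
    expressions. This is a minimal illustration, not the engine's actual
    matcher. The names wildcard_to_regex and matches are hypothetical.
    Two assumptions are made to agree with the descriptions: a trailing
    run of '?' means "up to that many extra characters" (Form 2), while
    an embedded '?' means exactly one character (Form 4). Matching is
    assumed to ignore case.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a wildcard word into a compiled regular expression."""
    # Assumption: trailing '?' characters are optional (Form 2).
    stripped = pattern.rstrip("?")
    trailing = len(pattern) - len(stripped)

    parts = []
    for ch in stripped:
        if ch == "*":
            parts.append(".*")  # zero or more arbitrary characters
        elif ch == "?":
            parts.append(".")   # embedded '?': exactly one character
        else:
            parts.append(re.escape(ch))
    if trailing:
        parts.append(".{0,%d}" % trailing)
    # Assumption: word matching is case-insensitive.
    return re.compile("".join(parts), re.IGNORECASE)


def matches(pattern, word):
    """True if the whole word matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(word) is not None
```

    For instance, matches("gar??", "garbs") holds because the word has
    length 5, while matches("gar??", "garage") does not.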

    B.          A <constant> is an integer (e.g. 47).

    C.          A quoted word is not interpreted, so anything inside
                will constitute a word except for a blank (' '). 
                Blanks inside a double quoted string will be construed as
                a separator denoting a string of words to find in
                exact order. Some implementations may have this feature
                turned off. Strings inside a single quote phrase 
                have NO characters interpreted, including the blank (' ')
                character.
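
    The difference between the two quoting styles can be sketched as
    follows. The function name split_quoted_word is hypothetical, and
    the sketch assumes the double quote feature is turned on.

```python
def split_quoted_word(token):
    """Return the list of words that a (possibly quoted) token denotes."""
    if len(token) >= 2 and token[0] == token[-1] == '"':
        # Double quotes: blanks separate words to be found in exact order.
        return token[1:-1].split()
    if len(token) >= 2 and token[0] == token[-1] == "'":
        # Single quotes: nothing is interpreted, blanks included.
        return [token[1:-1]]
    # Unquoted token: a single word.
    return [token]
```

    Here '"bob hope"' yields the two words bob and hope, to be found in
    that order, whereas "'bob hope'" yields one literal word containing
    a blank.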

    D.          A NOT condition cannot form a single term.
               
    E.          Sentences may or may not be implemented in a specific 
                application.

    F.          The '&' symbol is an anchor. It forces a term to be
                evaluated first instead of the MIN-term order.

    G.          The @ symbol means to find the word CLOSEST to the 
                given word.

    H.          The "~" symbol means the NOT or EXCLUDE operator.

    I.          Note the difference between the 'AND' operator and '+'.
                The AND operator applies to an entire document whereas
                the '+' operator is constrained by the PROXIMITY
                expression. If no proximity is applied, then the two
                operators are equivalent.
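
    The distinction in explanation I can be illustrated with a document
    modelled as a list of words. This is a sketch only: the function
    names are hypothetical, and measuring distance as the difference
    between word positions is an assumption, since the text does not
    define how PROXIMITY counts words.

```python
def and_match(words, a, b):
    """AND: both words occur anywhere in the document."""
    return a in words and b in words


def plus_within(words, a, b, max_words):
    """'+' under PROXIMITY: both words occur within max_words positions."""
    pos_a = [i for i, w in enumerate(words) if w == a]
    pos_b = [i for i, w in enumerate(words) if w == b]
    # Assumption: distance is the absolute difference of word positions.
    return any(abs(i - j) <= max_words for i in pos_a for j in pos_b)


doc = ("the garage has a heavy steel frame and "
       "a concrete floor near the door").split()

whole_document = and_match(doc, "door", "garage")       # True
within_4_words = plus_within(doc, "door", "garage", 4)  # False
```

    In this document "garage" and "door" are 12 positions apart, so the
    AND form matches while the form limited to 4 words does not.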

    EXAMPLES

    Note that in the following examples, #defines are used to give meaning
    to fields and are NOT part of the query syntax.

     To override precedence (or to make a query explicit), parentheses may be used:

     query ::= (bob OR ray) AND comedy BUTNOT hope

     (door + garage) proximity 4 words
     (door + concrete + heavy) proximity 2 paragraphs

     query ::= FIELD 4 BETWEEN 1.0, 1.5

     query ::= FIELD 2,3 > 19570426

     Note: This query uses a 32-bit integer collating sequence 
           for representing dates (Y2K compliant).
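
     A value such as 19570426 reads as year, month, day packed into one
     integer. A minimal sketch of that encoding, assuming the YYYYMMDD
     layout suggested by the example (the function names are
     hypothetical):

```python
from datetime import date


def date_to_int(d):
    """Pack a date into a YYYYMMDD integer that sorts chronologically."""
    return d.year * 10000 + d.month * 100 + d.day


def int_to_date(n):
    """Unpack a YYYYMMDD integer back into a date."""
    return date(n // 10000, (n // 100) % 100, n % 100)


cutoff = date_to_int(date(1957, 4, 26))  # 19570426
```

     Because the four-digit year is the most significant part, integer
     comparison agrees with date order across the year 2000, and every
     four-digit year fits in a signed 32-bit integer.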

     #define CATALOG 9 
     #define MODEL   10
     #define PARTNO  47

     query ::= FIELD CATALOG > 47 AND FIELD MODEL BETWEEN 4,8 
               AND FIELD PARTNO = 1145 SET 1,10
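
     Since the #defines are not part of the query syntax, an application
     would substitute them before the query is parsed. One way to sketch
     that substitution, with a hypothetical function expand_defines and
     the assumption that names are matched as whole words and case
     sensitively:

```python
import re


def expand_defines(query, defines):
    """Replace each whole-word occurrence of a defined name by its value."""
    def replace(match):
        word = match.group(0)
        return defines.get(word, word)  # leave undefined words untouched

    # Only identifiers are examined, so numbers such as 47 are never touched.
    # Note: this simple sketch would also expand names inside quoted phrases.
    return re.sub(r"[A-Za-z_][A-Za-z0-9_]*", replace, query)


DEFINES = {"CATALOG": "9", "MODEL": "10", "PARTNO": "47"}

expanded = expand_defines(
    "FIELD CATALOG > 47 AND FIELD MODEL BETWEEN 4,8", DEFINES)
# expanded == "FIELD 9 > 47 AND FIELD 10 BETWEEN 4,8"
```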

     #define Author DRI 2 Field 8[10]

     query ::= desert-fox IN ALL OR Erwin-Rommel IN Author.

     query ::= "bob hope" in field ~8

     <name>Schmitt<first>steve</first></name>
     query ::= steve IN TAG "name/first"

     <name MI = "J">Schmitt<first>Steve</first></name>

     query ::= "J" in tag "name/@mi"
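
     The two TAG paths above can be mimicked with Python's standard
     xml.etree.ElementTree module. This is only an illustration of what
     the paths select, not how the search engine evaluates them. The
     helper tag_values and the wrapping <doc> element are hypothetical,
     and attribute names are compared without regard to case because the
     example pairs MI in the markup with @mi in the query.

```python
import xml.etree.ElementTree as ET


def tag_values(xml_text, path):
    """Return the text (or attribute values) selected by a tag path."""
    root = ET.fromstring(xml_text)
    steps = path.strip("/").split("/")

    attr = None
    if steps[-1].startswith("@"):
        # A final step such as @mi selects an attribute of the parent tag.
        attr = steps.pop()[1:].lower()

    elements = root.findall(".//" + "/".join(steps))
    if attr is None:
        return [(e.text or "").strip() for e in elements]

    # Assumption: attribute names match case-insensitively.
    return [value
            for e in elements
            for name, value in e.attrib.items()
            if name.lower() == attr]


record = '<doc><name MI="J">Schmitt<first>Steve</first></name></doc>'
first_names = tag_values(record, "name/first")  # ['Steve']
initials = tag_values(record, "name/@mi")       # ['J']
```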

   More XML XPATH examples:

   Given the XML markup:

   <employee-record>
      <identification>
        <name nickname = "buddy">
          <first>joe</first><middle>bob</middle><last>thornton</last>
        </name>
        <ss number = "366546660"></ss>
        <age>47</age>
      </identification>
      <status>
         disabled
      </status>
      <salary>
        40000
      </salary>
   </employee-record>


   The following example searches use path notation to describe the record: